libnvme: fix nvme_scan_topology() treating missing NVMe sysfs dirs as fatal #3153

Closed
Copilot wants to merge 2 commits into master from copilot/fix-nvme-cli-test-failure

Copilot AI commented Mar 10, 2026

nvme_scan_topology() failed when /sys/class/nvme or /sys/class/nvme-subsystem did not exist (i.e., no NVMe devices present). nvme_scan() then propagated the error and the Python global_ctx() constructor returned NULL, breaking the python-create-ctrl-object and python-sigsegv-during-gc tests.

Root cause

nvme_scan_ctrls() and nvme_scan_subsystems() wrap scandir(), which returns -1 with errno=ENOENT when the target directory doesn't exist. The prior code unconditionally treated any negative return as fatal:

ctrls.num = nvme_scan_ctrls(&ctrls.ents);
if (ctrls.num < 0) {
    nvme_msg(ctx, LOG_DEBUG, "failed to scan ctrls: %s\n",
             nvme_strerror(-ctrls.num));  // wrong: -(-1) = 1 = EPERM
    return ctrls.num;                    // propagates -1 as fatal error
}
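
To make the root cause concrete, here is a minimal standalone sketch of what scandir() reports for a missing directory. The helper name probe_dir() is hypothetical and is not part of libnvme; it only records the return value and the errno observed right after the call. It assumes a Linux-style errno numbering, where EPERM is 1.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not libnvme API): call scandir() on `path` and
 * return its raw result. On failure, errno is saved in *saved_errno
 * before anything else can overwrite it.
 */
static int probe_dir(const char *path, int *saved_errno)
{
    struct dirent **ents = NULL;
    int n;

    assert(path != NULL && saved_errno != NULL);
    *saved_errno = 0;

    n = scandir(path, &ents, NULL, alphasort);
    if (n < 0) {
        *saved_errno = errno;   /* the real reason, e.g. ENOENT */
        return n;               /* always -1, not a negated errno */
    }

    /* Success: release the entries allocated by scandir(). */
    for (int i = 0; i < n; i++)
        free(ents[i]);
    free(ents);
    return n;
}
```

For a path that does not exist, probe_dir() returns -1 and stores ENOENT. Negating the return value therefore yields 1, which is why the old log message reported the wrong error.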

Fix

  • Capture errno immediately after each failing scandir call
  • Treat ENOENT as zero entries — a missing sysfs directory is a valid "no devices" state, not an error
  • Return an error only for unexpected errno values
  • Fix the error message to use strerror(err) instead of nvme_strerror(-ctrls.num), which derived a nonsensical error code

ctrls.num = nvme_scan_ctrls(&ctrls.ents);
if (ctrls.num < 0) {
    int err = errno;
    if (err != ENOENT) {
        nvme_msg(ctx, LOG_DEBUG, "failed to scan ctrls: %s\n", strerror(err));
        return -err;
    }
    ctrls.num = 0;
}
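
Because the same handling applies to both the controller and the subsystem scan, the pattern could also be factored into one helper. The following is only a sketch under that assumption: scan_dir_tolerant() and free_entries() are hypothetical names, and the helper calls scandir() directly instead of going through nvme_scan_ctrls() or nvme_scan_subsystems().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not libnvme API): scan `path` and treat a missing
 * directory as "zero entries". Returns the entry count (>= 0) on success
 * or a negated errno for any failure other than ENOENT.
 */
static int scan_dir_tolerant(const char *path, struct dirent ***ents)
{
    int n;

    assert(path != NULL && ents != NULL);

    n = scandir(path, ents, NULL, alphasort);
    if (n < 0) {
        int err = errno;    /* capture before any other call */

        if (err != ENOENT)
            return -err;    /* unexpected failure: report it */
        *ents = NULL;       /* missing directory: nothing to free */
        return 0;
    }
    return n;
}

/* Release an entry list returned by scan_dir_tolerant(). */
static void free_entries(struct dirent **ents, int n)
{
    for (int i = 0; i < n; i++)
        free(ents[i]);
    free(ents);
}
```

A caller would check for a negative result once and could pass it to strerror() after negating it, since the helper returns a real errno value rather than scandir()'s bare -1.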

…cefully

When no NVMe devices are present, /sys/class/nvme and /sys/class/nvme-subsystem
directories may not exist. scandir() returns -1 with errno=ENOENT in this case.
Previously, nvme_scan_topology() treated this as a fatal error, causing nvme_scan()
to fail and the Python global_ctx() constructor to return NULL.

Fix: treat ENOENT from scandir as "no devices found" (0 entries) rather than an
error. Only treat other errno values as actual failures. Also fix error messages
to use strerror(errno) instead of nvme_strerror(-ctrls.num) which was computing
the wrong error code.

This fixes the python-create-ctrl-object and python-sigsegv-during-gc tests
that fail when no NVMe devices are present in the test environment.

Co-authored-by: igaw <[email protected]>
Copilot AI changed the title from "[WIP] Fix nvme-cli test failure in create-ctrl-object" to "libnvme: fix nvme_scan_topology() treating missing NVMe sysfs dirs as fatal" on Mar 10, 2026
Copilot finished work on behalf of igaw March 10, 2026 12:45
igaw closed this Mar 10, 2026
igaw deleted the copilot/fix-nvme-cli-test-failure branch March 10, 2026 13:02

igaw (Collaborator) commented Mar 10, 2026

Proper fix is #3156
